Abstract — We address the question whether quantum memory 
is more powerful than classical memory. In particular, we 
consider a setting where information about a random n-bit string 
X is stored in s classical or quantum bits, for s < n, i.e., the 
stored information is bound to be only partial. Later, a randomly 
chosen predicate F about X has to be guessed using only 
the stored information. The maximum probability of correctly 
guessing F(X) is then compared for the cases where the storage 
device is classical or quantum mechanical, respectively. We show 
that, despite the fact that the measurement of quantum bits can 
depend arbitrarily on the predicate F, the quantum advantage 
is negligible already for small values of the difference n - s. Our 
setting generalizes the setting of Ambainis et al., who considered 
the problem of guessing an arbitrary bit (i.e., one of the n bits) 
of X. 

An implication for cryptography is that privacy amplification 
by universal hashing remains essentially equally secure when 
the adversary's memory is allowed to be quantum rather than 
only classical. Since privacy amplification is a main ingredient 
of many quantum key distribution (QKD) protocols, our result 
can be used to prove the security of QKD in a generic way. 

Index Terms — Cryptography, privacy amplification, quantum 
information theory, quantum key distribution, quantum memory, 
security proofs, universal hashing. 



I. Introduction 

It is a well-known fact that in s quantum bits one cannot 
reliably store more than s classical bits of information. 1 In 
other words, the raw storage capacity (like the raw trans- 
mission capacity) of a quantum bit is just one bit of infor- 
mation. However, since quantum memory can be read by an 
arbitrary measurement determined only at the time of reading 
the memory, quantum memory can be expected to be more 
powerful than classical memory in any context where a string 
X of n > s bits of information is given (and hence can be 
stored only partially) and it is determined only later which 
information about X is of interest. 2 

The simplest setting one can consider is that one must 
use the stored information to guess F(X) for a randomly 
chosen predicate $F : \mathcal{X} \to \{0,1\}$. Ambainis, Nayak, 
Ta-Shma, and Vazirani [4], [5] were the first to study such 
a setting for the special case where X is an n-bit string 
and F(X) is an actual bit (i.e., one of the n bits) of X. 
Because in the quantum case one can let the measurement 
of the stored quantum bits depend arbitrarily on F, while in 
the classical case one can only read the stored information, 
quantum memory is potentially more powerful. However, we 
prove that having information about X stored in s quantum 
instead of s classical bits is essentially useless for guessing 
F(X), even for optimal quantum storage and measurement 
strategies. This is in accordance with the results in [4], [5] 
as well as with recent results on communication complexity 
(see, e.g., [6]) where the power of classical and quantum 
communication is compared. 

1 This is a direct consequence of the Holevo bound [1] stating that the 
accessible information contained in a quantum state cannot be larger than its 
von Neumann entropy. This assertion is also a consequence of the general 
results proven in this paper (cf. Section IV-C). 

2 A typical example of such a setting is the bounded-storage model [2], [3], 

In a cryptographic context, our results can be applied to 
the security analysis of cryptographic primitives in a context 
where an adversary might hold quantum information. An 
important example is privacy amplification introduced by 
Bennett, Brassard, and Robert [7] (see also [8]) which is a 
protocol between two parties, Alice and Bob. The goal is to 
turn a common n-bit string X, about which an adversary Eve 
has some partial information, into a highly secure k-bit key 
K. This can be achieved as follows: Alice and Bob publicly 
agree on a function $G : \{0,1\}^n \to \{0,1\}^k$ chosen from 
a two-universal class of hash functions 3 and then compute 
K = G(X). 4 It has been shown that, if Eve's information 
about X consists of no more than s classical bits, the final 
key is secure as long as $k < n - s$. 5 

Similar to the previously described setting, it seems to 
be a potential advantage for the adversary to have available 
s quantum instead of s classical bits of information about 
X because she later learns the function G and can let her 
measurement of the s quantum bits depend on G. This may 
allow her to obtain more information about the final key K. 
We prove that this is not the case, i.e., privacy amplification 
remains equally secure against adversaries holding quantum 
information. 

This has interesting implications for quantum key distribution 
(QKD): In a QKD protocol, Alice and Bob first exchange 
quantum information (e.g., polarized photons) to generate a 
raw key X which is only partially secure, i.e., Eve has some 
quantum information $\rho$ about X. In a second (purely classical) 
phase, Alice and Bob apply privacy amplification to generate 
the final secret key K. Our result on the security of privacy 
amplification thus reduces the problem of proving the security 
of a QKD protocol to the problem of finding a bound on the 
number of qubits needed to (reliably) store Eve's information 
$\rho$. In [10], this fact has been exploited to show the security 
of a generic QKD protocol which, in particular, implies the 
security of many known protocols such as BB84 [11]. This 
simplifies and generalizes 6 known security proofs (see, e.g., 
[12]) which are based on completely different techniques. It 
also generalizes a proof by Ben-Or [13] which is based on 
a similar idea using results from communication complexity 
theory [14]. 

3 See Section II-A for a definition of two-universality. 
4 Equivalently, they can use an extractor [9]. 
5 More precisely, her information is exponentially small in $n - s - k$. 
6 Most known security proofs are restricted to one specific QKD protocol. 

The paper is organized as follows. In Section III we 
introduce a general framework for modeling and quantifying 
knowledge and storage devices. The framework is then used in 
Section IV to state and prove bounds on the success probability 
when guessing a binary predicate F of X given information 
about X stored in a quantum storage device (Section IV-B). 
These are then compared to the situation where the information 
about X is purely classical (Section IV-C). In Section V 
the results are extended to non-binary functions, which then 
allows for proving the security of privacy amplification against 
quantum adversaries (Section V-B). 

II. Preliminaries 

A. Notation 

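Let $\mathcal{F}(\mathcal{X} \to \mathcal{Y})$ be the set of functions with 
domain $\mathcal{X}$ and range $\mathcal{Y}$. The set 
$\mathcal{F}(\mathcal{X} \to \{0,1\})$ of binary functions with domain 
$\mathcal{X}$, in the following called predicates on $\mathcal{X}$, is 
denoted as $\mathcal{F}^{\mathcal{X}}_{\mathrm{bin}}$. Similarly, 
$\mathcal{F}^{\mathcal{X}}_{\mathrm{bal}} := \{ f \in \mathcal{F}^{\mathcal{X}}_{\mathrm{bin}} : |f^{-1}(\{0\})| = |f^{-1}(\{1\})| \}$ 
is the set of balanced predicates on $\mathcal{X}$. 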

Throughout this paper, random variables are denoted by capital 
letters (e.g., $X$), their range by corresponding calligraphic 
letters ($\mathcal{X}$), and the values they take on by lower case 
letters ($x$). The event that two random variables X and Y take on the 
same value is denoted as $X = Y$. In contrast, we write $X \equiv Y$ 
if two random variables X and Y are identical (i.e., if $X = Y$ 
always holds). The expectation $E_{x \leftarrow P_X}[f(x)]$ of a function $f$ 
on the random variable X is given by $\sum_{x \in \mathcal{X}} P_X(x) f(x)$. 

For a channel $C$ from $\mathcal{S}$ to $\mathcal{W}$ and a random variable 
$S$ on $\mathcal{S}$, we denote by $C_S$ the output of $C$ on input $S$, i.e., 
if the channel is defined by the conditional distributions $P_{W|S=s}$ for 
$s \in \mathcal{S}$, the joint probability distribution of $C_S$ and $S$ is 
given by $P_{C_S S}(w, s) = P_S(s) P_{W|S=s}(w)$ for all 
$(w, s) \in \mathcal{W} \times \mathcal{S}$. 

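A random function $G$ from $\mathcal{X}$ to $\mathcal{Y}$ is a random variable 
taking values from the set $\mathcal{F}(\mathcal{X} \to \mathcal{Y})$ of 
functions mapping elements from $\mathcal{X}$ to $\mathcal{Y}$. The set of 
random functions from $\mathcal{X}$ to $\mathcal{Y}$ is denoted as 
$\mathcal{R}(\mathcal{X} \to \mathcal{Y})$. If 
$G \in \mathcal{R}(\mathcal{X} \to \mathcal{Y})$ is uniformly distributed over 
$\mathcal{F}(\mathcal{X} \to \mathcal{Y})$, it is called a uniform random 
function from $\mathcal{X}$ to $\mathcal{Y}$. Similarly, a (uniform) random 
predicate $F$ on $\mathcal{X}$ is a random function with (uniform) 
distribution over the set $\mathcal{F}^{\mathcal{X}}_{\mathrm{bin}}$, and a 
(uniform) balanced random predicate is (uniformly) distributed over the set 
$\mathcal{F}^{\mathcal{X}}_{\mathrm{bal}}$. In the sequel, we 
will only use random functions which are independent of all 
other (previously defined) random variables. 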

A random function $G$ from $\mathcal{X}$ to $\mathcal{Y}$ is called 7 
two-universal if $P_{g \leftarrow P_G}[g(x) = g(x')] \le 1/|\mathcal{Y}|$ holds 
for any distinct $x, x' \in \mathcal{X}$. In particular, $G$ is two-universal 
if, for any distinct $x, x' \in \mathcal{X}$, the random variables $G(x)$ and 
$G(x')$ are independent and uniformly distributed. For instance, a uniform 
random function from $\mathcal{X}$ to $\mathcal{Y}$ is two-universal. 
Non-trivial examples where the distribution of $G$ is over a smaller set of 
functions (thus requiring less randomness) can, e.g., be found in [15] and [16]. 

7 In the literature, two-universality is usually defined for families 
$\mathcal{G}$ of functions: A family $\mathcal{G}$ is called two-universal if 
the random function $G$ with uniform distribution over $\mathcal{G}$ is 
two-universal. For our purposes, however, our more general definition is more 
convenient. 
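
As a small illustration of the two-universality condition, the following Python sketch checks the collision bound by brute force for three toy families of predicates on $\{0,1\}^2$. The helper names and the parity family $g_a(x) = \langle a, x \rangle \bmod 2$ are assumptions chosen for the example; they are not taken from [15] or [16].

```python
from itertools import product

def collision_probability(family, x, x2):
    """P[g(x) = g(x2)] for g drawn uniformly from `family` (list of dicts)."""
    hits = sum(1 for g in family if g[x] == g[x2])
    return hits / len(family)

def is_two_universal(family, domain, range_size):
    """Check P[g(x) = g(x')] <= 1/|Y| for all distinct x, x' in the domain."""
    bound = 1.0 / range_size
    return all(
        collision_probability(family, x, x2) <= bound + 1e-12
        for x in domain for x2 in domain if x != x2
    )

# Domain {0,1}^2 encoded as the integers 0..3, range {0,1}.
domain = range(4)

# Support of a uniform random predicate: all 16 functions.
all_predicates = [dict(zip(domain, values)) for values in product((0, 1), repeat=4)]

def parity(a, x):
    """Inner product of the bit strings a and x, modulo 2."""
    return bin(a & x).count("1") % 2

# Hypothetical smaller family: 4 parity functions instead of 16 predicates.
linear_family = [{x: parity(a, x) for x in domain} for a in range(4)]

# Counterexample: constant predicates always collide.
constant_family = [{x: c for x in domain} for c in (0, 1)]

print(is_two_universal(all_predicates, domain, 2))
print(is_two_universal(linear_family, domain, 2))
print(is_two_universal(constant_family, domain, 2))
```

The parity family needs only two random bits to select a function, which is the sense in which a smaller family requires less randomness.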

B. Distance from Uniform 

The variational distance between two distributions $P$ and 
$P'$ over an alphabet $\mathcal{Z}$ is defined as 

\[ \delta(P, P') := \frac{1}{2} \sum_{z \in \mathcal{Z}} |P(z) - P'(z)| . \]

The variational distance $\delta(P, \bar{P})$ of a distribution $P$ from 
the uniform distribution $\bar{P}$ (over the same alphabet $\mathcal{Z}$) is of 
particular interest in cryptographic applications. We will use 
the abbreviation $d(P)$ for this quantity and refer to it as the 
distance of $P$ from uniform. For the distance of the distribution 
of a random variable $Z$ from uniform, we also write $d(Z)$ 
instead of $d(P_Z)$, and, more generally, for any event $\mathcal{E}$, 
$d(Z|\mathcal{E}) := d(P_{Z|\mathcal{E}})$. Note that $d$ is a convex function, 
i.e., for two probability distributions $P$ and $P'$, and $q, q' \in [0, 1]$ 
with $q + q' = 1$, we have $d(q P + q' P') \le q\, d(P) + q'\, d(P')$. 
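
The definitions above translate directly into a few lines of Python. This is a minimal sketch in which distributions are represented as dictionaries from symbols to probabilities (a representation chosen here for illustration); it also checks the convexity inequality on one example.

```python
def variational_distance(p, q):
    """delta(P, P') = 1/2 * sum_z |P(z) - P'(z)| for dicts over the same alphabet."""
    return 0.5 * sum(abs(p[z] - q[z]) for z in p)

def distance_from_uniform(p):
    """d(P): variational distance between P and the uniform distribution."""
    uniform = 1.0 / len(p)
    return 0.5 * sum(abs(pz - uniform) for pz in p.values())

def mix(p, p2, q):
    """Convex combination q*P + (1-q)*P'."""
    return {z: q * p[z] + (1 - q) * p2[z] for z in p}

p = {0: 0.7, 1: 0.1, 2: 0.1, 3: 0.1}
p2 = {0: 0.1, 1: 0.7, 2: 0.1, 3: 0.1}

lhs = distance_from_uniform(mix(p, p2, 0.5))
rhs = 0.5 * distance_from_uniform(p) + 0.5 * distance_from_uniform(p2)
print(lhs, rhs)  # convexity: lhs <= rhs
```

Here both $P$ and $P'$ are at distance 0.45 from uniform, while their equal mixture is at distance 0.3.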

The distance $d(Z)$ of a random variable $Z$ from uniform 
has a natural interpretation: It equals the probability that $Z$ 
deviates from a uniformly distributed random variable $\bar{Z}$, in 
the following sense. 

Lemma 1: For any probability distribution $P_Z$ on $\mathcal{Z}$ there 
exists a channel $P_{\bar{Z}|Z}$ such that $P_{\bar{Z}}$ is the uniform 
distribution on $\mathcal{Z}$ and 
$P_{(z,\bar{z}) \leftarrow P_{Z\bar{Z}}}[z = \bar{z}] = 1 - d(Z)$. 

For two random variables $Z$ and $W$, the (expected) distance 
of $Z$ from uniform given $W$ is defined (cf. [2]) as the 
expectation of the distance of $Z$ from uniform conditioned on 
$W$, i.e., $d(Z|W) := E_{w \leftarrow P_W}[d(P_{Z|W=w})]$. It follows directly 
from the convexity of $d$ that $d(Z|W) \ge d(Z)$, and, more 
generally, for an additional random variable $V$ and an event 
$\mathcal{E}$, $d(Z|WV, \mathcal{E}) \ge d(Z|V, \mathcal{E})$. 

III. Modeling Knowledge and Storage 
A. Knowledge and Guessing 

Let $Z$ be a random variable and let A be an entity 
with knowledge described by a random variable $W$ (jointly 
distributed with $Z$ according to some distribution $P_{ZW}$). 
Intuitively, one would say that A knows nothing about $Z$ 
if $Z$ is uniformly distributed given knowledge $W$, i.e., 
$P_{ZW} = \bar{P}_Z \times P_W$ where $\bar{P}_Z$ is the uniform distribution. 
The following straightforward generalization of Lemma 1 suggests 
that the distance $d(Z|W)$ of $Z$ from uniform given $W$ can be 
interpreted as the probability of deviating from this situation. 

Lemma 2: For any probability distribution $P_{WZ}$ on 
$\mathcal{W} \times \mathcal{Z}$ there exists a channel $P_{\bar{Z}|WZ}$ such 
that $P_{\bar{Z}}$ is the uniform distribution on $\mathcal{Z}$, 
$P_{\bar{Z}W} = P_{\bar{Z}} \times P_W$, and 
$P_{(z,\bar{z}) \leftarrow P_{Z\bar{Z}}}[z = \bar{z}] = 1 - d(Z|W)$. 

This is of particular interest in cryptography, where, for 
instance, A is an adversary with knowledge $W$ and where 
one wants to use $Z$ as a key. Typically, a cryptosystem based 
on a key $\bar{Z}$ is secure when $\bar{Z}$ is uniformly distributed and 
independent of A's knowledge. The lemma implies that, with 
probability $1 - d(Z|W)$, $Z$ is equal to such a perfect key $\bar{Z}$. 
This means that any statement which is true for an ideal setting 
where $\bar{Z}$ is used as a key automatically holds, with probability 
at least $1 - d(Z|W)$, for a real setting where $Z$ is the key. 

The distance from uniform $d(Z|W)$ is also a measure for 
the maximum success probability $P_{\mathrm{guess}}(Z|W)$ of an entity A 
knowing $W$ when trying to guess $Z$, 

\[ P_{\mathrm{guess}}(Z|W) := \max_{C} \; P_{(w,z) \leftarrow P_{WZ}}[C_w = z] , \]

where the maximum is over all channels $C$ from $\mathcal{W}$ to $\mathcal{Z}$. 8 
The following lemma is an immediate consequence of the 
simple fact that the best strategy for guessing $Z$ given $W = w$ 
is to choose a value $z$ maximizing the probability $P_{Z|W}(z|w)$. 

Lemma 3: Let $W$ and $Z$ be random variables. Then 
$P_{\mathrm{guess}}(Z|W) \le \frac{1}{|\mathcal{Z}|} + d(Z|W)$, where equality 
holds if $Z$ is binary. 
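
To make Lemma 3 concrete, the following Python sketch computes $P_{\mathrm{guess}}(Z|W)$ and $d(Z|W)$ from a joint distribution given as a dictionary `{(w, z): probability}`. The representation and the two example distributions are assumptions made for illustration.

```python
from collections import defaultdict

def conditional_distance(joint, z_values):
    """d(Z|W) = sum_w P_W(w) * d(P_{Z|W=w}) for joint = {(w, z): prob}."""
    rows = defaultdict(dict)
    for (w, z), p in joint.items():
        rows[w][z] = p
    uniform = 1.0 / len(z_values)
    total = 0.0
    for row in rows.values():
        p_w = sum(row.values())
        if p_w > 0:
            total += p_w * 0.5 * sum(
                abs(row.get(z, 0.0) / p_w - uniform) for z in z_values
            )
    return total

def guessing_probability(joint):
    """P_guess(Z|W): for each w, guess the most likely z."""
    best = defaultdict(float)
    for (w, z), p in joint.items():
        best[w] = max(best[w], p)
    return sum(best.values())

# Binary Z: Lemma 3 holds with equality.
binary_joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}
print(guessing_probability(binary_joint),
      0.5 + conditional_distance(binary_joint, (0, 1)))

# Ternary Z with trivial W: the inequality is strict.
ternary_joint = {(0, 0): 0.5, (0, 1): 0.5, (0, 2): 0.0}
print(guessing_probability(ternary_joint),
      1 / 3 + conditional_distance(ternary_joint, (0, 1, 2)))
```

In the binary example both sides equal 0.7; in the ternary example the guessing probability is 0.5 while the bound evaluates to 2/3.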

B. Selectable Knowledge 

The characterization of knowledge about a random variable 
$Z$ held by an entity A in terms of a random variable $W$ is 
sufficient whenever this knowledge is fully accessible, e.g., 
written down on a sheet of paper or stored in a classical storage 
device. However, in a more general context A might have an 
option as to which information she can obtain. For example, 
if her information about $Z$ is encoded into the state $\rho$ of a 
quantum system, she may select one arbitrary measurement 
to "read it out". Formally, every measurement corresponds to 
a channel $W$ from the state space of the quantum system to 
the set of possible measurement outcomes. The situation is 
thus completely characterized by the set of measurements (that 
is, channels) $\mathbf{W}$ and the joint distribution of $Z$ and $\rho$. This 
setting is discussed in detail in Section III-D. Another (more 
artificial) example might be a storage unit which can hold two 
bits $S = B_1 B_2$, but which allows reading out only one of these 
bits, i.e., A can read either the value $B_1$ or $B_2$. In this case, the 
situation is described by the joint distribution of $Z$ and $S$ and 
the set of channels $\{p_1, p_2\}$, where channel $p_i$ maps $(b_1, b_2)$ 
to $b_i$ for $i = 1, 2$. To model these situations, it is useful to 
introduce the following notion. 

Definition 4: A selectable channel $\mathbf{W}$ on $\mathcal{S}$ with range 
$\mathcal{W}$ is a set of channels from $\mathcal{S}$ to $\mathcal{W}$. 

Consider now a setting as described above, i.e., there is a 
system which is in a state described by a random variable $S$ 
on $\mathcal{S}$, and an entity A has access to $S$ by means of a channel 
$W$ from a set $\mathbf{W}$. In the following, we say that an entity A 
has selectable knowledge $\mathbf{W}_S$, meaning that A can learn the 
value of exactly one arbitrarily chosen random variable $W_S$ 
with $W \in \mathbf{W}$. The knowledge of A about a random variable 
$Z$ can then be quantified by a natural generalization of the 
distance measure introduced above. 

Definition 5: Let $S$ and $Z$ be random variables and let $\mathbf{W}$ 
be a selectable channel on the range of $S$. The distance of $Z$ 
from uniform given $\mathbf{W}_S$ is 

\[ d(Z|\mathbf{W}_S) := \max_{W \in \mathbf{W}} d(Z|W_S) . \]
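
The two-bit storage unit from the beginning of this subsection gives a simple worked instance of Definition 5. The sketch below is illustrative only: it restricts attention to deterministic channels (here, "read bit 1" and "read bit 2"), and the function names are hypothetical.

```python
from collections import defaultdict

def distance_given_channel(joint_sz, channel, z_values):
    """d(Z|W_S) for a deterministic channel s -> w and joint = {(s, z): prob}."""
    joint_wz = defaultdict(float)
    for (s, z), p in joint_sz.items():
        joint_wz[(channel(s), z)] += p
    p_w = defaultdict(float)
    for (w, _), p in joint_wz.items():
        p_w[w] += p
    uniform = 1.0 / len(z_values)
    total = 0.0
    for w, pw in p_w.items():
        if pw > 0:
            total += pw * 0.5 * sum(
                abs(joint_wz.get((w, z), 0.0) / pw - uniform) for z in z_values
            )
    return total

def selectable_distance(joint_sz, channels, z_values):
    """Maximum of d(Z|W_S) over all channels W in the selectable channel."""
    return max(distance_given_channel(joint_sz, c, z_values) for c in channels)

def read_first(s):
    return s[0]

def read_second(s):
    return s[1]

two_bit_device = [read_first, read_second]

def uniform_joint(predicate):
    """S = (B1, B2) uniform on {0,1}^2 and Z = predicate(B1, B2)."""
    return {((b1, b2), predicate(b1, b2)): 0.25 for b1 in (0, 1) for b2 in (0, 1)}

print(selectable_distance(uniform_joint(lambda b1, b2: b1), two_bit_device, (0, 1)))
print(selectable_distance(uniform_joint(lambda b1, b2: b1 & b2), two_bit_device, (0, 1)))
print(selectable_distance(uniform_joint(lambda b1, b2: b1 ^ b2), two_bit_device, (0, 1)))
```

For $Z = B_1$ the device reveals $Z$ completely (distance 1/2), for $Z = B_1 \wedge B_2$ either bit gives distance 1/4, and for $Z = B_1 \oplus B_2$ a single readable bit gives no information at all, even though the stored state determines $Z$.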
The significance of this generalized definition of distance 
from uniform, e.g., in cryptography, is implied by a 
straightforward extension of Lemma 2. 

Lemma 6: Let $S$ and $Z$ be random variables and let $\mathbf{W}$ be 
a selectable channel on the range of $S$. Then for any choice 
of an element $W$ of $\mathbf{W}$, there exists a random variable $\bar{Z}$ 
defined by a channel $P_{\bar{Z}|W_S Z}$ such that $P_{\bar{Z}}$ is the uniform 
distribution on $\mathcal{Z}$, $P_{W_S \bar{Z}} = P_{W_S} \times P_{\bar{Z}}$, and 
$P_{(z,\bar{z}) \leftarrow P_{Z\bar{Z}}}[z = \bar{z}] \ge 1 - d(Z|\mathbf{W}_S)$. 

Similarly, Lemma 3 can be generalized to obtain a bound 
for the maximum success probability of an entity A with 
selectable knowledge $\mathbf{W}_S$ when guessing $Z$, 

\[ P_{\mathrm{guess}}(Z|\mathbf{W}_S) := \max_{W \in \mathbf{W}} P_{\mathrm{guess}}(Z|W_S) . \]

Lemma 7: Let $S$ and $Z$ be random variables and let $\mathbf{W}$ be 
a selectable channel on the range of $S$. Then 
$P_{\mathrm{guess}}(Z|\mathbf{W}_S) \le \frac{1}{|\mathcal{Z}|} + d(Z|\mathbf{W}_S)$, 
where equality holds if $Z$ is binary. 

8 Recall that $C_W$ denotes the output of the channel $C$ on input $W$. 

Consider now a situation where the information about $Z$ of 
an entity A is described by both some selectable knowledge 
$\mathbf{W}_S$, and, additionally, a random variable $U$ which she can use 
to choose an element from $\mathbf{W}$. More precisely, she applies 
some channel $C = P_{W|U}$ from $\mathcal{U}$ to $\mathbf{W}$ to the random 
variable $U$ and then chooses to learn $W_S$ for the resulting 
$W = C_U \in \mathbf{W}$. We will then be interested in the maximal 
distance of $Z$ from uniform resulting from an optimal strategy 
used by A. Such an optimal strategy consists simply of 
(deterministically) choosing some $W \in \mathbf{W}$ which maximizes 
$E_{w \leftarrow P_{W_S}}[d(P_{Z|W_S=w,U=u})]$, given $U = u$. We thus introduce 
the following quantity. 

Definition 8: Let $S$, $U$ and $Z$ be random variables and let 
$\mathbf{W}$ be a selectable channel on the range of $S$. The distance of 
$Z$ from uniform given $\mathbf{W}_S$ and $U$ is defined as 

\[ d(Z|\mathbf{W}_S; U) := E_{u \leftarrow P_U}\Big[ \max_{W \in \mathbf{W}} d(Z|W_S, U = u) \Big] . \qquad (1) \]

It is easy to see that 

\[ d(Z|\mathbf{W}_S; U) = d(Z|\mathbf{V}_{(S,U)}) \]

for some selectable channel $\mathbf{V}$ on $\mathcal{S} \times \mathcal{U}$ which 
models the fact that A can choose an arbitrary strategy. In particular, 
Lemma 6 and Lemma 7 still hold when $\mathbf{W}_S$ is replaced by 
$\mathbf{W}_S; U$, where $P_{\mathrm{guess}}(Z|\mathbf{W}_S; U)$ is defined as the 
maximal success probability of A when guessing $Z$ in the situation 
described above. 

It is a direct consequence of the properties of the variational 
distance that knowledge of an additional random variable $U$ 
can only increase the distance from uniform given selectable 
knowledge. 

Lemma 9: Let $S$, $U$ and $Z$ be random variables and let $\mathbf{W}$ 
be a selectable channel on the domain of $S$. Then 

\[ d(Z|\mathbf{W}_S; U) \ge d(Z|\mathbf{W}_S) . \]

C. Storage Devices 

A (physical) storage device is a physical system where the 
information it contains is determined by its physical state 
$s$. Information is stored in the device by choosing a state 
$s$ from its state space $\mathcal{S}$. A storage device might provide 
different mechanisms to read out this information, each of 
them resulting in some (generally only partial) information 
about its state $s$. However, any possible strategy of accessing 
the stored information can be described as a channel mapping 
the memory state to a random variable $W$. We thus define a 
storage device with state space $\mathcal{S}$ and range $\mathcal{W}$ as a 
selectable channel $p$ from $\mathcal{S}$ to $\mathcal{W}$. 

As an example, consider the (artificial) storage device 
mentioned above which allows storing two bits, but where 
only one of them can be read out. Formally, this storage device 
is a selectable channel $p = \{p_1, p_2\}$ from the state space 
$\mathcal{S} = \{0, 1\} \times \{0, 1\}$ to the set $\{0, 1\}$ where $p_m$ is the 
channel mapping $(b_1, b_2)$ to $b_m$, for $m \in \{1, 2\}$. 

The most trivial case is a classical storage device for storing 
$s$ bits and allowing all $s$ bits to be read out without errors. 
Obviously, its state can take one of $2^s$ possible values. 
Moreover, any accessing strategy corresponds to a channel 
with the state as input. Formally, a classical $s$-bit storage device is 
defined as the selectable channel $\mathbf{C}^{2^s}$ containing all channels 
taking inputs from the set $\{0, 1\}^s$. (In Section III-D we will 
give an analogous definition for quantum storage devices.) 
Note that for a random variable $Z$ and a random variable $S$ 
on $\{0, 1\}^s$, $d(Z|\mathbf{C}^{2^s}_S) = d(Z|S)$. Thus we omit to mention 
the selectable channel if it is clear from the context, e.g., we 
write $d(Z|S; U)$ instead of $d(Z|\mathbf{C}^{2^s}_S; U)$. 

D. Quantum Storage 

An $s$-qubit storage device is a quantum system of dimension 
$d = 2^s$ where information is stored by encoding it into the state 
of the system. This information can (partially) be read out by 
measuring the system's state with respect to some (arbitrarily 
chosen) measurement basis. Each pure state of a $d$-dimensional 
quantum system corresponds to a normalized vector $|\psi\rangle$ in a 
$d$-dimensional Hilbert space $\mathcal{H}_d$. Equivalently, the set of pure 
states can be identified with the set 
$\mathcal{P}(\mathcal{H}_d) := \{ |\psi\rangle\langle\psi| : |\psi\rangle \in \mathcal{H}_d, \; |\langle\psi|\psi\rangle| = 1 \}$ 
where $|\psi\rangle\langle\psi|$ is the projection operator in 
$\mathcal{H}_d$ along the vector $|\psi\rangle$. The set of all possible states of 
the quantum system is then given by the set of mixed states 
$\mathcal{S}(\mathcal{H}_d)$, which is the convex hull of $\mathcal{P}(\mathcal{H}_d)$. 

It is well known from quantum information theory that the 
most general strategy to access the information contained in 
a quantum system is to perform a positive operator-valued 
measurement (POVM), which gives a classical measurement 
outcome $W$. Any possible measurement is specified by a 
family $\{E_w\}_{w \in \mathcal{W}}$ of nonnegative operators on $\mathcal{H}_d$ 
satisfying $\sum_{w \in \mathcal{W}} E_w = \mathrm{id}_{\mathcal{H}_d}$. If the system 
is in state $\rho$, the probability of obtaining the (classical) measurement 
outcome $w \in \mathcal{W}$ when applying measurement $\{E_w\}_{w \in \mathcal{W}}$ 
is given by $P_{\{E_w\}}(w|\rho) := \mathrm{tr}(E_w \rho)$. 
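
The rule $P_{\{E_w\}}(w|\rho) = \mathrm{tr}(E_w \rho)$ can be evaluated with plain nested lists for a single qubit. The following Python sketch is a minimal illustration; the matrix helpers and the two example measurements are assumptions chosen here, and the completeness check does not test positivity of the operators.

```python
def matmul(a, b):
    """Product of two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def outcome_probabilities(povm, rho):
    """List of tr(E_w rho) for each POVM element E_w."""
    return [complex(trace(matmul(e, rho))).real for e in povm]

def sums_to_identity(povm, tol=1e-12):
    """Completeness check: the POVM elements must add up to the identity."""
    dim = len(povm[0])
    total = [[sum(e[i][j] for e in povm) for j in range(dim)] for i in range(dim)]
    return all(abs(total[i][j] - (1 if i == j else 0)) < tol
               for i in range(dim) for j in range(dim))

# State |0><0| of one qubit.
rho_zero = [[1, 0], [0, 0]]

# Measurement in the computational basis {|0>, |1>}.
computational = [[[1, 0], [0, 0]], [[0, 0], [0, 1]]]

# Measurement in the basis {|+>, |->}.
diagonal = [[[0.5, 0.5], [0.5, 0.5]], [[0.5, -0.5], [-0.5, 0.5]]]

print(outcome_probabilities(computational, rho_zero))
print(outcome_probabilities(diagonal, rho_zero))
```

The same stored state yields a deterministic outcome under the first measurement and a uniformly random one under the second, which is why the choice of measurement matters for what can be read out.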

In the framework presented in the previous section, a 
$d$-dimensional quantum storage device $\mathbf{Q}^d$ is thus defined as the 
set of channels $P_{\{E_w\}}$ describing all possible POVMs $\{E_w\}$ 
on a $d$-dimensional quantum state, i.e., 

\[ \mathbf{Q}^d := \{ P_{\{E_w\}} : \{E_w\} \in \mathrm{POVM}(\mathcal{H}_d) \} . \]

A general way of describing this setting is to define the 
state $S$ of the storage device by a family of quantum states 
$\{\rho_x\}_{x \in \mathcal{X}} \subset \mathcal{S}(\mathcal{H}_d)$, where $\rho_x$ is the 
conditional state of the system given $X = x$, that is, $S = \rho_X$. Similar 
to the notation introduced for classical storage devices $\mathbf{C}^{2^s}$, we 
will also write $\rho_X$ instead of $\mathbf{Q}^d_{\rho_X}$. 



According to Definition 5, the distance $d(Z|\rho_X)$ of a 
random variable $Z$ from uniform given $\rho_X$ can be written as 

\[ d(Z|\rho_X) = \max_{\{E_w\}} d(Z|W) \]

where the maximum is taken over all POVMs $\{E_w\}$ and 
where $W$ is the measurement outcome of $\{E_w\}$ applied to 
the quantum state, i.e., $P_{W|X=x}(w) = \mathrm{tr}(E_w \rho_x)$. Similarly, 
for an additional random variable $U$, 

\[ d(Z|\rho_X; U) = E_{u \leftarrow P_U}\Big[ \max_{\{E^u_w\}} d(Z|W, U = u) \Big] \]

where, for each $u$, $\{E^u_w\}$ is a POVM and where $W$ is defined 
by $P_{W|X=x,U=u}(w) = \mathrm{tr}(E^u_w \rho_x)$. 

IV. Quantum Knowledge About Predicates 

A. The Quantum Binary Decision Problem 

We begin this section by stating a few known results 
about the so-called quantum binary decision problem, which 
are central to the proof of our main statements concerning 
quantum knowledge. 

Let $\rho_0, \rho_1 \in \mathcal{S}(\mathcal{H})$ be arbitrary (mixed) states of a 
quantum mechanical system $\mathcal{H}$, and suppose that the system is prepared 
either in the state $\rho = \rho_0$ or in $\rho = \rho_1$ with a priori 
probabilities $q$ and $1 - q$, respectively. The quantum binary 
decision problem is the problem of deciding between these 
two possibilities by an appropriate measurement. Any decision 
strategy can be summarized by a binary valued POVM 
$\{E_0, E_1\}$, where the hypothesis $H_i : \rho = \rho_i$ is chosen 
whenever the outcome is $i \in \{0, 1\}$. For a fixed strategy 
$\{E_0, E_1\}$, the probability of choosing $H_i$, when the actual 
state is $\rho_j$, is given by $P[H_i | \rho = \rho_j] = \mathrm{tr}(E_i \rho_j)$, 
$i, j \in \{0, 1\}$. Thus the expected probability of success for this strategy 
equals 

\[ P_q^{\{E_0,E_1\}}(\rho_0, \rho_1) := q \, \mathrm{tr}(E_0 \rho_0) + (1 - q) \, \mathrm{tr}(E_1 \rho_1) . \]

The maximum achievable expected success probability in the 
binary decision problem is the quantity 

\[ P_q^{\max}(\rho_0, \rho_1) := \sup_{\{E_0,E_1\} \in \mathrm{POVM}} P_q^{\{E_0,E_1\}}(\rho_0, \rho_1) . \]

The following theorem is due to Helstrom [17]. We state it 
using the notation of Fuchs [18] who also gave a simple proof 
of it. 

Theorem 10: Let $\rho_0, \rho_1 \in \mathcal{S}(\mathcal{H}_d)$ be two states, let 
$q \in [0, 1]$, and let $\{\mu_i\}_{i=1}^d$ be the eigenvalues of the Hermitian 
operator $\Lambda := q \rho_0 - (1 - q) \rho_1$. Then the maximum achievable 
expected success probability in the quantum binary decision 
problem is 

\[ P_q^{\max}(\rho_0, \rho_1) = \frac{1}{2} + \frac{1}{2} \sum_{i=1}^{d} |\mu_i| . \]
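
For a single qubit ($d = 2$), Theorem 10 can be evaluated in closed form. The sketch below assumes the Bloch representation $\rho_i = (\mathrm{id} + \vec{r}_i \cdot \vec{\sigma})/2$, which is not used in the text; under that assumption $\Lambda = (a \, \mathrm{id} + \vec{v} \cdot \vec{\sigma})/2$ with $a = 2q - 1$ and $\vec{v} = q \vec{r}_0 - (1-q) \vec{r}_1$, so its eigenvalues are $(a \pm |\vec{v}|)/2$.

```python
import math

def helstrom_qubit(q, r0, r1):
    """Maximum success probability for deciding between two qubit states.

    q  : a priori probability of rho_0
    r0 : Bloch vector (x, y, z) of rho_0
    r1 : Bloch vector (x, y, z) of rho_1
    """
    a = q - (1 - q)  # trace of Lambda = q*rho_0 - (1-q)*rho_1
    v = [q * c0 - (1 - q) * c1 for c0, c1 in zip(r0, r1)]
    norm_v = math.sqrt(sum(c * c for c in v))
    mu_plus = (a + norm_v) / 2   # eigenvalues of Lambda
    mu_minus = (a - norm_v) / 2
    return 0.5 + 0.5 * (abs(mu_plus) + abs(mu_minus))

# Orthogonal states |0> and |1>: perfectly distinguishable.
print(helstrom_qubit(0.5, (0, 0, 1), (0, 0, -1)))

# Identical states: nothing better than guessing the more likely hypothesis.
print(helstrom_qubit(0.8, (0, 0, 1), (0, 0, 1)))

# Non-orthogonal states |0> and |+>.
print(helstrom_qubit(0.5, (0, 0, 1), (1, 0, 0)))
```

For $|0\rangle$ versus $|+\rangle$ with equal priors the result is $(1 + 1/\sqrt{2})/2 \approx 0.854$.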






B. Bounds on Quantum Knowledge 

Let $X$ be a random variable and let $F$ be a randomly chosen 
predicate on $\mathcal{X}$. The goal of this section is to derive a bound 
on the distance of $F(X)$ from uniform given knowledge about 
$X$ stored in a quantum storage device. 

Such knowledge is modeled by a family of quantum states 
$\{\rho_x\}_{x \in \mathcal{X}}$, where $\rho_x$ is the state of the quantum system 
conditioned on the event that $X = x$. An explicit expression 
for the corresponding quantity can be obtained using a result 
on the quantum binary decision problem (cf. Section IV-A). 

Lemma 11: Let $X$ be a random variable with range $\mathcal{X}$ and 
let $F$ be a random predicate on $\mathcal{X}$. Let 
$\{\rho_x\}_{x \in \mathcal{X}} \subset \mathcal{S}(\mathcal{H}_d)$ be 
a family of quantum states on a $d$-dimensional Hilbert space. 
Then 

\[ d(F(X)|\rho_X; F) = \frac{1}{2} E_{f \leftarrow P_F}\Big[ \sum_{j=1}^{d} |\mu_j^f| \Big] \]

where $\{\mu_j^f\}_{j=1}^d$ are the eigenvalues of the Hermitian operator 

\[ \Lambda_f := \sum_{x : f(x) = 0} P_X(x) \rho_x - \sum_{x : f(x) = 1} P_X(x) \rho_x . \]

Proof: It suffices to show that 

\[ d(f(X)|\rho_X) = \frac{1}{2} \sum_{j=1}^{d} |\mu_j^f| \qquad (2) \]

for every $f \in \mathcal{F}^{\mathcal{X}}_{\mathrm{bin}}$. Let thus $f$ be fixed and 
assume for simplicity that $P_{f(X)}(0) > 0$ and $P_{f(X)}(1) > 0$ (otherwise, 
(2) is trivially satisfied). 

Let $z \in \{0, 1\}$. Conditioned on the event that $f(X) = z$, the 
state $\rho$ equals $\rho_x$ with probability $P_{X|f(X)}(x|z)$. This situation 
can equivalently be described by saying that the system is in 
the mixed state $\sigma_z^f \in \mathcal{S}(\mathcal{H}_d)$, where 

\[ \sigma_z^f := \sum_{x : f(x) = z} P_{X|f(X)}(x|z) \rho_x . \]

The problem of guessing $f(X)$ thus corresponds exactly to the 
quantum binary decision problem described in Section IV-A, 
i.e., 

\[ P_{\mathrm{guess}}(f(X)|\rho_X) = P^{\max}_{P_{f(X)}(0)}(\sigma_0^f, \sigma_1^f) = \frac{1}{2} + \frac{1}{2} \sum_{j=1}^{d} |\mu_j^f| \]

where the second equality follows from Theorem 10. Finally, 
since $f(X)$ is binary, equation (2) follows from Lemma 7. ■ 

The expression for the distance of $F(X)$ from uniform 
provided by Lemma 11 is generally difficult to evaluate. The 
following theorem gives a much simpler upper bound for this 
quantity. 9 

Theorem 12: Let $X$ be a random variable with range $\mathcal{X}$ and 
let $F$ be a random predicate on $\mathcal{X}$. Let further 
$\{\rho_x\}_{x \in \mathcal{X}} \subset \mathcal{S}(\mathcal{H}_d)$ be a family of 
states on a $d$-dimensional Hilbert space. Then 

\[ d(F(X)|\rho_X; F) \le \frac{1}{2} d^{\frac{1}{2}} \sqrt{ \sum_{x, x' \in \mathcal{X}} \lambda_{x,x'} P_X(x) P_X(x') \, \mathrm{tr}(\rho_x \rho_{x'}) } \]

where $\lambda_{x,x'} := 2 P_{f \leftarrow P_F}[f(x) = f(x')] - 1$, for 
$x, x' \in \mathcal{X}$. 

9 The main idea in the proof of Theorem 12 is to replace occurrences of 
density operators by their squares. The resulting expressions correspond to 
classical collision probabilities, as used in the well-known classical analysis 
of privacy amplification. The application of Jensen's inequality corresponds 
to the transition from the variational to the Euclidean distance. In this sense, 
this proof can be seen as a generalization of the classical derivation. 



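Proof: We set out from the equation 

\[ d(F(X)|\rho_X; F) = \frac{1}{2} E_{f \leftarrow P_F}\Big[ \sum_{j=1}^{d} |\mu_j^f| \Big] \]

provided by Lemma 11. Note that, for any 
$f \in \mathcal{F}^{\mathcal{X}}_{\mathrm{bin}}$, 

\[ \sum_{j=1}^{d} |\mu_j^f| \le d^{\frac{1}{2}} \sqrt{ \sum_{j=1}^{d} |\mu_j^f|^2 } = d^{\frac{1}{2}} \sqrt{ \mathrm{tr}(\Lambda_f^2) } \]

where the inequality is Jensen's inequality (applied to the 
convex mapping $x \mapsto x^2$) and where the equality is a 
consequence of Schur's (in)equality (cf. Lemma 20), which 
can be applied because $\Lambda_f$ is Hermitian and thus also normal. 
We conclude that 

\[ d(F(X)|\rho_X; F) \le \frac{1}{2} d^{\frac{1}{2}} E_{f \leftarrow P_F}\Big[ \sqrt{ \mathrm{tr}(\Lambda_f^2) } \Big] \le \frac{1}{2} d^{\frac{1}{2}} \sqrt{ E_{f \leftarrow P_F}\big[ \mathrm{tr}(\Lambda_f^2) \big] } \qquad (3) \]

where Jensen's inequality is applied once again. 
By the definition of $\Lambda_f$ in Lemma 11 we have 

\[ \mathrm{tr}(\Lambda_f^2) = \sum_{x,x' : f(x) = f(x')} P_X(x) P_X(x') \, \mathrm{tr}(\rho_x \rho_{x'}) - \sum_{x,x' : f(x) \ne f(x')} P_X(x) P_X(x') \, \mathrm{tr}(\rho_x \rho_{x'}) \]
\[ = \sum_{x, x' \in \mathcal{X}} (2 \delta_{f(x), f(x')} - 1) P_X(x) P_X(x') \, \mathrm{tr}(\rho_x \rho_{x'}) , \]

where $\delta_{y,y'}$ is the Kronecker delta 10. The assertion then 
follows by taking the expectation of this expression over $F$ 
and combining the result with (3). ■ 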

If $F$ is two-universal, the quantity on the right hand side 
of Theorem 12 can be bounded by an expression which is 
independent of the particular storage function. 

Corollary 13: Let $X$ be a random variable with range $\mathcal{X}$ 
and let $F$ be a two-universal random predicate on $\mathcal{X}$. Then for 
every family $\{\rho_x\}_{x \in \mathcal{X}} \subset \mathcal{S}(\mathcal{H}_d)$ of 
states on a $d$-dimensional Hilbert space 

\[ d(F(X)|\rho_X; F) \le \frac{1}{2} d^{\frac{1}{2}} \sqrt{ \sum_{x \in \mathcal{X}} P_X^2(x) } . \]

Proof: Since $F$ is two-universal, the values $\lambda_{x,x'}$ (as defined 
in Theorem 12) cannot be positive for any distinct $x, x' \in \mathcal{X}$. 
Since $\mathrm{tr}(\rho_x \rho_{x'}) \ge 0$, we conclude that 
$\lambda_{x,x'} \, \mathrm{tr}(\rho_x \rho_{x'}) \le 0$ for $x \ne x'$. Moreover, 
$\lambda_{x,x} = 1$ and $\mathrm{tr}(\rho_x \rho_x) \le 1$, for any 
$x \in \mathcal{X}$. Combining these facts, the assertion follows directly 
from the upper bound given by Theorem 12. ■ 

10 $\delta_{y,y'}$ equals 1 if $y = y'$ and 0 otherwise. 

Note that the expression under the square root is simply 
the collision probability $P_C(X)$ of $X$. Hence, with the Rényi 
entropy $R(X) = -\log_2 P_C(X)$, the above inequality can be 
rewritten as 

\[ d(F(X)|\rho_X; F) \le \frac{1}{2} 2^{-\frac{R(X) - s}{2}} \qquad (4) \]

where $s$ is the number of qubits in which $X$ is stored, i.e., 
$\{\rho_x\}_{x \in \mathcal{X}} \subset \mathcal{S}(\mathcal{H}_{2^s})$. 
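
The bound can be checked numerically in the smallest quantum case, $s = 1$. The sketch below is illustrative and rests on assumptions not made in the text: states are given by Bloch vectors, so the operator $\Lambda_f$ of Lemma 11 has eigenvalues $(a_f \pm |\vec{v}_f|)/2$ with $a_f = \sum_x \pm P_X(x)$ and $\vec{v}_f = \sum_x \pm P_X(x) \vec{r}_x$, and $F$ is taken to be the uniform random predicate, which is two-universal. The choice of cube vertices as stored states is arbitrary.

```python
import math
from itertools import product

def exact_distance(p_x, bloch):
    """Lemma 11 for qubit states and a uniform random predicate F."""
    n = len(p_x)
    total = 0.0
    for f in product((0, 1), repeat=n):          # all predicates on X
        signs = [1 if fx == 0 else -1 for fx in f]
        a = sum(s * p for s, p in zip(signs, p_x))
        v = [sum(s * p * r[k] for s, p, r in zip(signs, p_x, bloch))
             for k in range(3)]
        norm_v = math.sqrt(sum(c * c for c in v))
        # |mu_1| + |mu_2| for the eigenvalues (a +/- |v|)/2 of Lambda_f
        total += abs(a + norm_v) / 2 + abs(a - norm_v) / 2
    return 0.5 * total / 2 ** n

def corollary_bound(p_x, dim):
    """Corollary 13: 1/2 * sqrt(d) * sqrt(collision probability of X)."""
    return 0.5 * math.sqrt(dim) * math.sqrt(sum(p * p for p in p_x))

def renyi_bound(p_x, s):
    """Inequality (4): 1/2 * 2^(-(R(X) - s)/2)."""
    renyi = -math.log2(sum(p * p for p in p_x))
    return 0.5 * 2 ** (-(renyi - s) / 2)

# X uniform on 3-bit strings, stored in one qubit at the 8 cube vertices.
p_x = [1 / 8] * 8
scale = 1 / math.sqrt(3)
cube = [(scale * x, scale * y, scale * z)
        for x, y, z in product((-1, 1), repeat=3)]

print(exact_distance(p_x, cube), corollary_bound(p_x, 2), renyi_bound(p_x, 1))
```

For this choice the two forms of the bound agree at 0.25, and the exact distance computed from Lemma 11 stays below it.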

C. Comparing Classical and Quantum Storage Devices 

Since orthogonal states of a quantum system can always be 
perfectly distinguished, a random variable $X$ can always be 
stored and perfectly retrieved in a quantum storage device of 
dimension $d$ as long as the size of the range of $X$ does not 
exceed $d$. Hence, a classical $s$-bit storage device $\mathbf{C}^{2^s}$ cannot 
be more powerful than a storage device $\mathbf{Q}^{2^s}$ consisting of $s$ 
qubits. Formally, this can be stated as follows. For any random 
variables $X$ and $S$ on $\mathcal{X}$ and $\{0, 1\}^s$, respectively, there is a 
family of states $\{\rho_x\}_{x \in \mathcal{X}} \subset \mathcal{S}(\mathcal{H}_{2^s})$ 
such that 

\[ d(F(X)|SF) \le d(F(X)|\rho_X; F) , \quad \text{for any } F \in \mathcal{R}(\mathcal{X} \to \mathcal{Y}). \qquad (5) \]

The following lemma shows that, on the other hand, a 
quantum storage device can indeed be more useful than a 
corresponding classical storage device. However, we will see 
later that this is only true for special cases, e.g., if the 
difference between the number $n$ of bits to be stored and the 
capacity $s$ of the storage device is small. 

Lemma 14: Let $X$ be uniformly distributed over $\{0, 1\}^2$ 
and let $F$ be a uniform balanced predicate on $\{0, 1\}^2$. Then 
for any random variable $S$ on $\{0, 1\}$ defined by a channel 
$P_{S|X}$, 

\[ d(F(X)|SF) \le \frac{1}{4} . \]

Similarly, for every family 
$\{\rho_x\}_{x \in \{0,1\}^2} \subset \mathcal{S}(\mathcal{H}_2)$ of quantum 
states on a 2-dimensional Hilbert space 

\[ d(F(X)|\rho_X; F) \le \frac{1}{2\sqrt{3}} \approx 0.289 \]

and there exist families 
$\{\rho_x\}_{x \in \{0,1\}^2} \subset \mathcal{S}(\mathcal{H}_2)$ saturating this 
bound. 

Proof: By the convexity of the variational distance, it 
suffices to consider random variables $S$ which depend in a 
deterministic way on $X$, that is, $S = \varphi_c(X)$ for some function 
$\varphi_c : \{0, 1\}^2 \to \{0, 1\}$. It can easily be verified (by an explicit 
calculation) that 

\[ d(F(X)|\varphi_c(X) F) \le \frac{1}{4} \]

for any function $\varphi_c$ from $\{0,1\}^2$ to $\{0,1\}$, and that equality 
holds for $\varphi_c : (x_1, x_2) \mapsto x_1 \cdot x_2$ (i.e., 
$\varphi_c(x_1, x_2) = 1$ if and only if $x_1 = x_2 = 1$). This proves the 
first (classical) statement of the lemma. 

For the second (quantum) statement, for the same reason 
as above, it suffices to consider pure states only. Let 
$\{|\psi_x\rangle\langle\psi_x|\}_{x \in \{0,1\}^2} \subset \mathcal{S}(\mathcal{H}_2)$ 
be an arbitrary family of pure quantum states. It follows from the linearity 
of the trace and the inequality $\mathrm{tr}(\Lambda^2) \ge (\mathrm{tr}\,\Lambda)^2 / d$, 
valid for any Hermitian operator on a $d$-dimensional space and applied here to 
$\Lambda := \sum_{x \in \mathcal{X}} |\psi_x\rangle\langle\psi_x|$, that 

\[ \sum_{x, x' \in \mathcal{X}} |\langle\psi_x|\psi_{x'}\rangle|^2 \ge |\mathcal{X}|^2 / d . \]

The bound $d(F(X) \,|\, |\psi_X\rangle\langle\psi_X|; F) \le 1/(2\sqrt{3})$ can then 
be obtained from Theorem 12 with $P_{f \leftarrow P_F}[f(x) = f(x')] = \frac{1}{3}$ 
for distinct $x, x'$ (implying $\lambda_{x,x'} = -\frac{1}{3}$). 

It remains to be proven that 
$d(F(X) \,|\, |\psi_X\rangle\langle\psi_X|; F) = 1/(2\sqrt{3})$ for a family of 
states $\{|\psi_x\rangle\langle\psi_x|\}_{x \in \{0,1\}^2} \subset \mathcal{S}(\mathcal{H}_2)$. 
Such states can be defined by setting $|\psi_{00}\rangle$, $|\psi_{01}\rangle$, 
$|\psi_{10}\rangle$ and $|\psi_{11}\rangle$ to the vertices of a tetrahedron in 
$\mathcal{P}(\mathcal{H}_2)$ (or, more precisely, in the Bloch sphere which 
corresponds to $\mathcal{P}(\mathcal{H}_2)$). 
The assertion then follows from a straightforward calculation. ■ 

Together with Lemma 7, Lemma 14 implies that the maximum 
probability of correctly guessing a randomly chosen 
balanced predicate $F$ about a random 2-bit string $X$ is larger 
if information about $X$ can be stored in one qubit 
($P_q \approx 0.789$) than if this information is stored in one classical bit 
($P_c = 0.75$). Note that this is in accordance with earlier results 
showing that one individual qubit can be stronger than one 
classical bit (see, e.g., [4]). 
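
Both numbers can be reproduced by brute force. The following Python sketch enumerates all deterministic one-bit encodings for the classical case and evaluates the expression of Lemma 11 for a tetrahedral family of qubit states. The Bloch-vector representation and the particular tetrahedron coordinates are assumptions made for this example; for a balanced predicate the trace of $\Lambda_f$ vanishes, so the sum of absolute eigenvalues is just the norm of the signed average Bloch vector.

```python
import math
from itertools import product

X = list(product((0, 1), repeat=2))
# The six balanced predicates on {0,1}^2.
BALANCED = [dict(zip(X, v)) for v in product((0, 1), repeat=4) if sum(v) == 2]

def classical_distance(phi):
    """d(F(X)|phi(X) F) for uniform X and a uniform balanced predicate F."""
    total = 0.0
    for f in BALANCED:
        for w in (0, 1):
            cell = [x for x in X if phi[x] == w]
            if cell:
                p_w = len(cell) / 4
                p_zero = sum(1 for x in cell if f[x] == 0) / len(cell)
                total += p_w * abs(p_zero - 0.5)   # binary distance from uniform
    return total / len(BALANCED)

best_classical = max(
    classical_distance(dict(zip(X, v))) for v in product((0, 1), repeat=4)
)

scale = 1 / math.sqrt(3)
TETRAHEDRON = dict(zip(X, [
    (scale, scale, scale), (scale, -scale, -scale),
    (-scale, scale, -scale), (-scale, -scale, scale),
]))

def quantum_distance(bloch):
    """Lemma 11 for qubit states: a_f = 0 for balanced f, so sum |mu_j| = |v_f|."""
    total = 0.0
    for f in BALANCED:
        v = [sum((1 if f[x] == 0 else -1) * 0.25 * bloch[x][k] for x in X)
             for k in range(3)]
        total += math.sqrt(sum(c * c for c in v))
    return 0.5 * total / len(BALANCED)

print(0.5 + best_classical)                 # classical guessing probability
print(0.5 + quantum_distance(TETRAHEDRON))  # quantum guessing probability
```

The classical maximum is attained by storing the AND of the two bits, in agreement with the proof of Lemma 14.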

Surprisingly, this advantage of a quantum storage device 
becomes negligible if the difference $n - s$ between the length 
$n$ of the bitstring $X$ and the number $s$ of bits/qubits of the 
storage device becomes large. To see this, let us first state a 
lower bound for the distance of $F(X)$ from uniform given the 
knowledge stored in a classical storage device. 

Lemma 15: Let $X$ be uniformly distributed on $\{0, 1\}^n$ and 
let $F$ be a uniform random predicate on $\{0,1\}^n$. Then for any 
$s < n$ there exists a random variable $S$ on $\{0, 1\}^s$ defined by 
a channel $P_{S|X}$ such that 

\[ \frac{1}{2} C(2^{n-s}) \le d(F(X)|SF) \qquad (6) \]

where $C(m) := \frac{1}{2^m} \binom{m}{m/2} = \sqrt{\frac{2}{\pi m}} \big( 1 + O(\tfrac{1}{m}) \big)$. 
In particular, 

\[ \frac{1}{\sqrt{2\pi}} \, 2^{-\frac{n-s}{2}} \big( 1 + O(2^{-(n-s)}) \big) \le d(F(X)|SF) . \]

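Proof: Let $\varphi$ be a function from $\{0,1\}^n$ to $\{0,1\}^s$ such that for any $w \in \{0,1\}^s$, the set $\varphi^{-1}(\{w\}) := \{x \in \{0,1\}^n : \varphi(x) = w\}$ has size $2^{n-s}$. We claim that $S = \varphi(X)$ satisfies (6).

For any fixed $w \in \{0,1\}^s$ and $f \in \mathcal{F}_{\mathrm{bin}}^{\{0,1\}^n}$,

$$d(f(X)\,|\,\varphi(X) = w) = \Bigl| P[f(X) = 0\,|\,\varphi(X) = w] - \frac{1}{2} \Bigr| = \Bigl| \frac{k_f}{2^{n-s}} - \frac{1}{2} \Bigr|$$

where $k_f := |f^{-1}(\{0\}) \cap \varphi^{-1}(\{w\})|$. Since $F$ is uniformly distributed on the set $\mathcal{F}_{\mathrm{bin}}^{\{0,1\}^n}$, we have $P_{f \leftarrow P_F}[k_f = k] = \binom{2^{n-s}}{k}\, 2^{-2^{n-s}}$ for $k \in \{0, \ldots, 2^{n-s}\}$, hence

$$E_{f \leftarrow P_F}\bigl[d(f(X)\,|\,\varphi(X) = w)\bigr] = \sum_{k=0}^{2^{n-s}} \binom{2^{n-s}}{k}\, 2^{-2^{n-s}} \Bigl| \frac{k}{2^{n-s}} - \frac{1}{2} \Bigr| = \frac{1}{2}\, C(2^{n-s}) ,$$

where the last equality follows from equation (14) of Lemma 22. As $w \in \{0,1\}^s$ was arbitrary, this concludes the proof. (The approximation for $C(m)$ can be obtained from Lemma 23.) ■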
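The computation in this proof can be illustrated numerically. The sketch below assumes that $C(m)$ denotes $2^{-m}\binom{m}{m/2}$ and that the number $k_f$ of zeros of $f$ inside one preimage cell is binomially distributed with parameters $m = 2^{n-s}$ and $1/2$, as in the proof; the function names are chosen for this example only.

```python
import math


def C(m):
    """Central binomial coefficient normalised by 2^m (m assumed even)."""
    return math.comb(m, m // 2) / 2 ** m


def expected_distance(m):
    """E|k/m - 1/2| for k ~ Binomial(m, 1/2), computed as an explicit sum."""
    return sum(math.comb(m, k) * abs(k / m - 0.5) for k in range(m + 1)) / 2 ** m


if __name__ == "__main__":
    for t in range(1, 7):  # t plays the role of n - s
        m = 2 ** t
        asymptotic = math.sqrt(2 / (math.pi * m))
        print(t, expected_distance(m), 0.5 * C(m), C(m) / asymptotic)
```

The second and third printed columns coincide, and the last column approaches 1 as $n - s$ grows, consistent with the stated approximation of $C(m)$.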

Combining Lemma ^] with inequalities @ and (0, we 
conclude that the distance from uniform has the same asymp- 
totic behavior for the classical and the quantum case: The 
knowledge about the predicate F(X) decreases exponentially 
in the difference n − s between the length of the bitstring X
and the size s of the storage device. 

More precisely, since, for $n - s \geq 1$,

$$\frac{1}{2}\, C(2^{n-s}) \geq \frac{1}{2}\, 2^{-\frac{n-s+1}{2}} ,$$


it follows from Lemma IT31 and © that there exists a random 
variable $S$ on $\{0,1\}^s$ defined by a channel $P_{S|X}$ such that $d(F(X)|SF) \geq d(F(X)|\rho_x; F)$ for any family of states $\{\rho_x\}_{x \in \{0,1\}^n} \subset \mathcal{S}(\mathcal{H}_{2^{s-1}})$. This means that storing information about X in s classical bits instead of s − 1 quantum bits allows F(X) to be predicted with a lower error probability.

V. From the Binary to the Non-Binary Case 
A. Relations Between Bounds on Knowledge 

We start with a lemma bounding the distance of a random 
variable X from uniform by the distance of a binary hash value 
F(X) from uniform where F is a randomly chosen balanced 
predicate. This is related to the Vazirani XOR lemma (see 
e.g., [19]), which gives a similar bound for the case where F 
is chosen randomly from the set of all linear functions. 11 

Lemma 16 (Hashing Lemma): Let $X$ be a random variable with range $\mathcal{X}$ and let $F$ be a uniform balanced random predicate on $\mathcal{X}$. Then

$$d(X) \leq \tfrac{3}{2} \sqrt{|\mathcal{X}|}\; d(F(X)|F) .$$
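Proof: For any probability distribution $Q$ over $\mathcal{X}$ and any $f \in \mathcal{F}_{\mathrm{bal}}^{\mathcal{X}}$, let $d_f(Q) := d(f(X'))$ be the distance between the uniform distribution and the distribution of $f(X')$ where $X'$ is a random variable distributed according to $Q$. We have to show that

$$d(Q) \leq \tfrac{3}{2} \sqrt{|\mathcal{X}|}\; E_{f \leftarrow P_F}[d_f(Q)] \tag{7}$$

for any distribution $Q$ over $\mathcal{X}$. Defining the coefficients $a_x(Q) := Q(x) - \frac{1}{|\mathcal{X}|}$, and the sets $\mathcal{X}_Q^+ := \{x \in \mathcal{X} : a_x(Q) \geq 0\}$ and $\mathcal{X}_Q^- := \mathcal{X} \setminus \mathcal{X}_Q^+$, we obtain

$$d(Q) = \sum_{x \in \mathcal{X}_Q^+} a_x(Q) = - \sum_{x \in \mathcal{X}_Q^-} a_x(Q) \tag{8}$$

and, for any $f \in \mathcal{F}_{\mathrm{bal}}^{\mathcal{X}}$ and $\mathcal{X}_f := \{x \in \mathcal{X} : f(x) = 0\}$,

$$d_f(Q) = \Bigl| \sum_{x \in \mathcal{X}_f} a_x(Q) \Bigr| \tag{9}$$

respectively. Note that, since $d$ is convex, $d_f$ is convex as well and thus so is its expected value $E_{f \leftarrow P_F}[d_f(\cdot)]$ (i.e., the function defined by $Q \mapsto E_{f \leftarrow P_F}[d_f(Q)]$).

Let us first show that inequality (7) holds for distributions $\bar{Q}$ over $\mathcal{X}$ where the probabilities only take two possible values, $|\bar{Q}(\mathcal{X})| \leq 2$, i.e., there exist $a^+ \geq 0$ and $a^- \leq 0$ such that

$$a_x(\bar{Q}) = a^+ \ \text{for } x \in \mathcal{X}_{\bar{Q}}^+ \quad \text{and} \quad a_x(\bar{Q}) = a^- \ \text{for } x \in \mathcal{X}_{\bar{Q}}^- .$$

Then the value $d_f(\bar{Q})$ in (9) only depends on the number $k(f) := |\mathcal{X}_f \cap \mathcal{X}_{\bar{Q}}^+|$ of values $x \in \mathcal{X}_{\bar{Q}}^+$ for which $f(x) = 0$.

To get some intuition, consider the case where $|\mathcal{X}_{\bar{Q}}^+| = \frac{1}{2}|\mathcal{X}|$. Since $f$ is randomly chosen, the expected deviation of $k(f)$ from its average value $\frac{1}{4}|\mathcal{X}|$ is proportional to $\sqrt{|\mathcal{X}|}$. Furthermore, $d_f(\bar{Q})$ is proportional to this deviation and $a^+$, and $a^+$ is proportional to $d(\bar{Q})$ and inversely proportional to $|\mathcal{X}|$. Neglecting the constants, this already shows that (7) holds in this particular case.

Proving the exact statement (7) requires a little bit more computation. For any predicate $f \in \mathcal{F}_{\mathrm{bal}}^{\mathcal{X}}$, expression (9) reads

$$d_f(\bar{Q}) = \Bigl| \sum_{x \in \mathcal{X}_f} a_x(\bar{Q}) \Bigr| = \Bigl| k(f)\, a^+ + \bigl(\tfrac{n}{2} - k(f)\bigr)\, a^- \Bigr|$$

where $n := |\mathcal{X}|$. With $s := |\mathcal{X}_{\bar{Q}}^+|$, expression (8) implies

$$a^+ = \frac{d(\bar{Q})}{s} \quad \text{and} \quad a^- = -\frac{d(\bar{Q})}{n-s}$$

and hence

$$d_f(\bar{Q}) = \Bigl| d(\bar{Q}) \Bigl( \frac{k(f)}{s} - \frac{\frac{n}{2} - k(f)}{n-s} \Bigr) \Bigr| = d(\bar{Q})\, \Bigl| k(f) - \frac{s}{2} \Bigr|\, \frac{n}{s(n-s)} .$$

Consequently, for $Q = \bar{Q}$, inequality (7) is equivalent to

$$\frac{1}{|\mathcal{F}_{\mathrm{bal}}^{\mathcal{X}}|} \sum_{f \in \mathcal{F}_{\mathrm{bal}}^{\mathcal{X}}} \Bigl| k(f) - \frac{s}{2} \Bigr|\, \frac{n}{s(n-s)} \geq \frac{2}{3\sqrt{n}} .$$

Since the term in the sum over $\mathcal{F}_{\mathrm{bal}}^{\mathcal{X}}$ only depends on $k(f)$, the sum can be replaced by a sum over $k$, i.e., we have to show that

$$\sum_{k=\max(0,\, s - \frac{n}{2})}^{\min(s,\, \frac{n}{2})} \binom{n}{n/2}^{-1} \binom{s}{k} \binom{n-s}{\frac{n}{2}-k}\, \Bigl| k - \frac{s}{2} \Bigr|\, \frac{n}{s(n-s)} = \frac{\bigl((\frac{n}{2})!\bigr)^2\, s!\,(n-s)!\; n}{n!\; s(n-s)}\, S_{n,s} \geq \frac{2}{3\sqrt{n}} \tag{10}$$

with

$$S_{n,s} := \sum_{k=\max(0,\, s - \frac{n}{2})}^{\min(s,\, \frac{n}{2})} \frac{\bigl| k - \frac{s}{2} \bigr|}{k!\,(s-k)!\,(\frac{n}{2} - s + k)!\,(\frac{n}{2} - k)!} .$$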



11 The following version of Vazirani's XOR lemma is proved in [20]: $d(X) \leq \sqrt{|\mathcal{X}|\; E_{l \leftarrow P_L}[d(l(X))^2]}$, where $P_L$ is the uniform distribution on the set of all non-zero linear functions from $\mathcal{X}$ to $\{0, 1\}$.



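The term $S_{n,s}$ has different analytic solutions depending on whether $s$ is even or odd. Let us first assume that $s$ is even. Replacing the summation index $k$ by $\bar{k} = k - \frac{s}{2}$ and making use of the symmetry of the resulting terms with respect to the sign of $\bar{k}$, we get

$$S_{n,s} = 2 \sum_{k=0}^{\min(\frac{s}{2},\, \frac{n-s}{2})} \frac{k}{(\frac{s}{2}+k)!\,(\frac{s}{2}-k)!\,(\frac{n-s}{2}+k)!\,(\frac{n-s}{2}-k)!} = \frac{s(n-s)}{2n\, \bigl((\frac{s}{2})!\bigr)^2 \bigl((\frac{n-s}{2})!\bigr)^2} ,$$

where the second equality follows from equation (15) of Lemma 22 with $a = \frac{s}{2}$ and $b = \frac{n-s}{2}$. A straightforward calculation then shows that for fixed $n$ the minimum of the left hand side of the inequality in (10) is taken for $s$ as close as possible to $\frac{n}{2}$, i.e., $s = 2\lfloor \frac{n}{4} \rfloor$ and $n - s = 2\lceil \frac{n}{4} \rceil$, that is

$$\frac{\bigl((\frac{n}{2})!\bigr)^2\, s!\,(n-s)!\; n}{n!\; s(n-s)}\, S_{n,s} = \frac{\bigl((\frac{n}{2})!\bigr)^2\, s!\,(n-s)!}{2\, n!\, \bigl((\frac{s}{2})!\bigr)^2 \bigl((\frac{n-s}{2})!\bigr)^2} \geq \frac{\bigl((\frac{n}{2})!\bigr)^2\, (2\lfloor \frac{n}{4} \rfloor)!\, (2\lceil \frac{n}{4} \rceil)!}{2\, n!\, \bigl(\lfloor \frac{n}{4} \rfloor !\bigr)^2 \bigl(\lceil \frac{n}{4} \rceil !\bigr)^2} .$$

Lemma 23 is then used to derive a lower bound for the term on the right hand side of this inequality, leading to

$$\frac{\bigl((\frac{n}{2})!\bigr)^2\, s!\,(n-s)!\; n}{n!\; s(n-s)}\, S_{n,s} \geq \sqrt{\frac{2}{\pi n}}\; \exp\Bigl( \frac{2}{6n+1} + \frac{1}{24\lfloor \frac{n}{4} \rfloor + 1} + \frac{1}{24\lceil \frac{n}{4} \rceil + 1} - \frac{1}{12n} - \frac{1}{6\lfloor \frac{n}{4} \rfloor} - \frac{1}{6\lceil \frac{n}{4} \rceil} \Bigr) \geq \frac{2}{3\sqrt{n}} ,$$

where the last inequality holds for $n \geq 6$.

Similarly, for $s$ odd, applying equation (16) of Lemma 22 with $a = \frac{s-1}{2}$ and $b = \frac{n-s-1}{2}$ leads to

$$S_{n,s} = 2 \sum_{k=0}^{\min(a,b)} \frac{k + \frac{1}{2}}{(a+k+1)!\,(a-k)!\,(b+k+1)!\,(b-k)!} = \frac{2}{n\, \bigl((\frac{s-1}{2})!\bigr)^2 \bigl((\frac{n-s-1}{2})!\bigr)^2} ,$$

resulting in the same lower bound for the left hand side of the inequality in (10) for $n \geq 8$. Moreover, an explicit calculation shows that (10) also holds for $n = 2$, $n = 4$, and $n = 6$, which concludes the proof of inequality (7) for $Q = \bar{Q}$ with $|\bar{Q}(\mathcal{X})| \leq 2$.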
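Inequality (10) can also be checked directly for small parameters. The following Python sketch is an illustration under the assumption that the left hand side of (10) is the hypergeometric average of $|k - \frac{s}{2}|\,\frac{n}{s(n-s)}$ over balanced predicates, with $n = |\mathcal{X}|$ even and $1 \leq s \leq n-1$; the function names are hypothetical.

```python
import math


def lhs_of_10(n, s):
    """Average of |k - s/2| * n / (s (n - s)) over all balanced predicates.

    k counts the zeros of the predicate inside a fixed subset of size s;
    the number of balanced predicates with a given k is C(s,k) C(n-s,n/2-k).
    """
    half = n // 2
    total = 0.0
    for k in range(max(0, s - half), min(s, half) + 1):
        weight = math.comb(s, k) * math.comb(n - s, half - k) / math.comb(n, half)
        total += weight * abs(k - s / 2)
    return total * n / (s * (n - s))


def rhs_of_10(n):
    """Right hand side 2 / (3 sqrt(n))."""
    return 2 / (3 * math.sqrt(n))


if __name__ == "__main__":
    for n in range(2, 13, 2):
        worst = min(lhs_of_10(n, s) for s in range(1, n))
        print(n, worst, rhs_of_10(n))
```

For $n = 4$ and $s = 2$ both sides are equal, so in this reading the constant on the right hand side cannot be improved without excluding small $n$.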

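Let now $Q$ be an arbitrary distribution on $\mathcal{X}$ and let $\Gamma$ be the set of permutations on $\mathcal{X}$ with invariant sets $\mathcal{X}_Q^+$ and $\mathcal{X}_Q^-$, i.e., $\gamma(\mathcal{X}_Q^+) = \mathcal{X}_Q^+$ and $\gamma(\mathcal{X}_Q^-) = \mathcal{X}_Q^-$, for $\gamma \in \Gamma$. Since $d(Q) = d(Q \circ \gamma)$ for $\gamma \in \Gamma$, we find that

$$\bar{Q} := \frac{1}{|\Gamma|} \sum_{\gamma \in \Gamma} Q \circ \gamma$$

is a probability distribution satisfying $d(\bar{Q}) = d(Q)$ and taking identical probabilities for all elements in $\mathcal{X}_Q^+$ as well as for all elements in $\mathcal{X}_Q^-$, i.e., $|\bar{Q}(\mathcal{X})| \leq 2$. Since inequality (7) is already proven for distributions of this form, we conclude

$$d(Q) = d(\bar{Q}) \leq \tfrac{3}{2} \sqrt{|\mathcal{X}|}\; E_{f \leftarrow P_F}[d_f(\bar{Q})] \leq \tfrac{3}{2} \sqrt{|\mathcal{X}|}\; \frac{1}{|\Gamma|} \sum_{\gamma \in \Gamma} E_{f \leftarrow P_F}[d_f(Q \circ \gamma)] ,$$

where the second inequality is a consequence of the convexity of $E_{f \leftarrow P_F}[d_f(\cdot)]$. Assertion (7) then follows from $d_f(Q \circ \gamma) = d_{f \circ \gamma^{-1}}(Q)$, for all $f \in \mathcal{F}_{\mathrm{bal}}^{\mathcal{X}}$, $\gamma \in \Gamma$, and the fact that $F \circ \gamma^{-1}$ is a uniform balanced random predicate, i.e., $E_{f \leftarrow P_F}[d_{f \circ \gamma^{-1}}(Q)] = E_{f \leftarrow P_F}[d_f(Q)]$. ■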
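As a sanity check of the hashing lemma, one can evaluate both sides by brute force on small ranges. The sketch below assumes the statement $d(X) \leq \frac{3}{2}\sqrt{|\mathcal{X}|}\, d(F(X)|F)$ with $d$ the variational distance from uniform and $F$ uniform over all balanced predicates; the helper names and the use of pseudo-random test distributions are choices made for this illustration.

```python
import itertools
import math
import random


def distance_from_uniform(q):
    """Variational distance between q and the uniform distribution."""
    n = len(q)
    return 0.5 * sum(abs(p - 1 / n) for p in q)


def expected_bit_distance(q):
    """E_f |P[f(X) = 0] - 1/2| over all balanced predicates f.

    A balanced predicate is identified with its zero set of size n/2.
    """
    n = len(q)
    values = [abs(sum(q[x] for x in zeros) - 0.5)
              for zeros in itertools.combinations(range(n), n // 2)]
    return sum(values) / len(values)


def hashing_gap(q):
    """Right hand side minus left hand side of the hashing lemma."""
    return 1.5 * math.sqrt(len(q)) * expected_bit_distance(q) - distance_from_uniform(q)


def random_distribution(n, rng):
    """A pseudo-random probability vector of length n."""
    weights = [rng.random() for _ in range(n)]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    rng = random.Random(0)
    for n in (2, 4, 6):
        print(n, min(hashing_gap(random_distribution(n, rng)) for _ in range(200)))
    print(hashing_gap([0.5, 0.5, 0.0, 0.0]))  # a case where the gap vanishes
```

The distribution that is uniform on two of four elements makes the gap vanish (up to rounding), which matches the equality case of (10) for $n = 4$, $s = 2$.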

In order to apply the hashing lemma to generalize the 
results of the previous section to the non-binary case, we 
need a relation between binary random functions (i.e., random 
predicates) and non-binary random functions. 

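Lemma 17: Let $G$ be a two-universal random function from $\mathcal{X}$ to $\mathcal{Y}$ and let $F$ be a uniform balanced random predicate on $\mathcal{Y}$. Then the random predicate $H := F \circ G$ is two-universal.

Proof: For any distinct $x, x' \in \mathcal{X}$,

$$P_{h \leftarrow P_H}[h(x) = h(x')] = P_{g \leftarrow P_G}[g(x) = g(x')] + \bigl(1 - P_{g \leftarrow P_G}[g(x) = g(x')]\bigr) \cdot P_{f \leftarrow P_F,\; g \leftarrow P_{G|G(x) \neq G(x')}}[f(g(x)) = f(g(x'))] .$$

Note that $P_{f \leftarrow P_F,\; g \leftarrow P_{G|G(x) \neq G(x')}}[f(g(x)) = f(g(x'))]$ is the collision probability of the uniform balanced random predicate $F$, $P_{f \leftarrow P_F}[f(y) = f(y')]$ (for distinct $y, y' \in \mathcal{Y}$), which can easily be computed,

$$P_{f \leftarrow P_F}[f(y) = f(y')] = \frac{|\mathcal{Y}| - 2}{2(|\mathcal{Y}| - 1)} .$$

Since $G$ is two-universal, i.e., $P_{g \leftarrow P_G}[g(x) = g(x')] \leq \frac{1}{|\mathcal{Y}|}$, we have

$$\begin{aligned}
P_{h \leftarrow P_H}[h(x) = h(x')] &= P_{g \leftarrow P_G}[g(x) = g(x')]\, \bigl(1 - P_{f \leftarrow P_F}[f(y) = f(y')]\bigr) + P_{f \leftarrow P_F}[f(y) = f(y')] \\
&\leq \frac{1}{|\mathcal{Y}|} + \Bigl(1 - \frac{1}{|\mathcal{Y}|}\Bigr) P_{f \leftarrow P_F}[f(y) = f(y')] \\
&= \frac{1}{|\mathcal{Y}|} + \Bigl(1 - \frac{1}{|\mathcal{Y}|}\Bigr) \frac{|\mathcal{Y}| - 2}{2(|\mathcal{Y}| - 1)} = \frac{1}{2} ,
\end{aligned}$$

i.e., the random predicate $H$ is two-universal. ■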
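The two probabilities appearing in this proof can be reproduced by exhaustive enumeration. The sketch below is an illustration only: as a concrete two-universal family it takes $G$ uniform over all functions from a small domain to $\mathcal{Y}$ (whose collision probability is exactly $1/|\mathcal{Y}|$), which is an assumption of this example rather than something the lemma requires. Exact rational arithmetic avoids rounding issues.

```python
from fractions import Fraction
import itertools


def balanced_predicates(size):
    """All predicates on {0, ..., size-1} taking each value equally often."""
    return [f for f in itertools.product((0, 1), repeat=size) if sum(f) == size // 2]


def predicate_collision(size):
    """P[f(y) = f(y')] for two fixed distinct points and f uniform balanced."""
    preds = balanced_predicates(size)
    hits = sum(1 for f in preds if f[0] == f[1])
    return Fraction(hits, len(preds))


def composed_collision(domain_size, range_size):
    """P[h(x) = h(x')] for h = f o g, g uniform over all functions.

    The two distinct inputs x, x' are taken to be 0 and 1, so domain_size
    must be at least 2.
    """
    preds = balanced_predicates(range_size)
    funcs = list(itertools.product(range(range_size), repeat=domain_size))
    hits = sum(1 for g in funcs for f in preds if f[g[0]] == f[g[1]])
    return Fraction(hits, len(funcs) * len(preds))


if __name__ == "__main__":
    for size in (2, 4, 6, 8):
        print(size, predicate_collision(size), Fraction(size - 2, 2 * (size - 1)))
    print(composed_collision(3, 4))
```

For this particular choice of $G$ the bound of the lemma is met with equality.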
Combining Lemma 16 and Lemma 17 leads to a relation between the distance from uniform of the outcomes of binary and general (non-binary) two-universal functions on a random variable X, given some knowledge $W_S$. 12

Theorem 18: Let $X$ and $S$ be random variables on $\mathcal{X}$ and $\mathcal{S}$, respectively, and let $W$ be a selectable channel on $\mathcal{S}$. If, for all two-universal random predicates $H$ on $\mathcal{X}$,

$$d(H(X)|W_S; H) \leq \varepsilon , \tag{11}$$

then, for all two-universal random functions $G$ from $\mathcal{X}$ to $\mathcal{Y}$,

$$d(G(X)|W_S; G) \leq \tfrac{3}{2} \sqrt{|\mathcal{Y}|}\; \varepsilon . \tag{12}$$

Proof: From the definition of $d(\cdot|W_S; \cdot)$ we have

$$d(G(X)|W_S; G) = E_{g \leftarrow P_G}\bigl[\max d(g(X)|W_S)\bigr] .$$

The expression in the maximum can then be bounded using Lemma 16, that is

$$d(g(X)|W_S) \leq \tfrac{3}{2} \sqrt{|\mathcal{Y}|}\; d(F(g(X))|W_S F) .$$

This leads to

$$d(G(X)|W_S; G) \leq \tfrac{3}{2} \sqrt{|\mathcal{Y}|}\; E_{g \leftarrow P_G}\bigl[\max d(F(g(X))|W_S F)\bigr] .$$

Defining $H := F \circ G$, we obtain

$$d(G(X)|W_S; G) \leq \tfrac{3}{2} \sqrt{|\mathcal{Y}|}\; E_{h \leftarrow P_H}\bigl[\max d(h(X)|W_S)\bigr] = \tfrac{3}{2} \sqrt{|\mathcal{Y}|}\; d(H(X)|W_S; H) .$$

Finally, Lemma 17 states that $H$ is a two-universal random predicate on $\mathcal{X}$, hence the assertion of the theorem follows. ■

12 Using the version of Vazirani's XOR lemma stated in Footnote 11, the constant $\frac{3}{2}$ in the bound (12) of Theorem 18 can be eliminated by replacing condition (11) by the stronger requirement $\sqrt{E_{h \leftarrow P_H}[d(h(X)|W_S)^2]} \leq \varepsilon$.



B. Application: Privacy Amplification with a Quantum Adver- 
sary 

Consider two parties, Alice and Bob, being connected by 
an authentic but otherwise completely insecure communica- 
tion channel. Assume that they initially share a uniformly 
distributed n-bit key X about which an adversary Eve has 
some partial information, where the only bound known on 
Eve's information is that it consists of no more than s bits. 
Privacy amplification, introduced by Bennett, Brassard, and 
Robert [7], is a method to transform X into an almost perfectly 
secure key K. It has been shown that if Alice and Bob publicly (by communication over the insecure channel) choose a two-universal random function G mapping the n-bit string to a k-bit string K = G(X), for k smaller than n − s, then the resulting string K is secure (i.e., Eve has virtually no information about K). Note that n − s is roughly Eve's entropy about the initial string X, i.e., privacy amplification with two-universal random functions is asymptotically optimal with respect to the number of extractable key bits. In our formalism, the possibility of privacy amplification by applying a (two-universal) random function G, as proved in [7] (a simplified proof has been given in [8]), reads

$$d(G(X)|SG) = O\bigl(2^{-\frac{n-s-k}{2}}\bigr) \tag{13}$$

for any random variable $S$ on $\{0,1\}^s$ defined by a channel $P_{S|X}$.

Combining the results from the previous section, we obtain 
a similar statement for the situation where Eve's knowledge 
about X is stored in s quantum instead of s classical bits. 
More precisely, we can derive a bound on the distance of the 
final key K = G(X) from uniform, from an adversary's point 
of view, where G is a two-universal random function applied 



to an initial string X, assuming only that the adversary's 
knowledge about X is stored in a limited number s of qubits. 13 
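Corollary 19: Let $X$ be a random variable with range $\mathcal{X}$ and Rényi entropy $R(X) = n$ and let $G$ be a two-universal random function from $\mathcal{X}$ to $\{0,1\}^k$. Then, for any family of states $\{\rho_x\}_{x \in \mathcal{X}} \subset \mathcal{S}(\mathcal{H}_{2^s})$,

$$d(G(X)|\rho_x; G) \leq \tfrac{3}{4}\, 2^{-\frac{n-s-k}{2}} .$$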
Proof: Theorem 18 together with Corollary 13 implies

d(G(X)\p x ;G) < j /2*-2' 

for any family of states $\{\rho_x\}_{x \in \mathcal{X}} \subset \mathcal{S}(\mathcal{H}_{2^s})$. The corollary then follows from the definition of the Rényi entropy (cf. remark after the proof of Corollary 13). ■

We thus have a quantum analogue to (13), implying that privacy amplification remains equally secure (with the same parameters) if an adversary has quantum rather than only classical bits to store her information. Note that a similar bound follows from [13] together with a result of [5], for the case where G is the inner product with a randomly chosen string.
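To see what such a bound means in practice, the following Python sketch computes the longest key that keeps the distance from uniform below a target value. It assumes that the bound of Corollary 19 has the form $\frac{3}{4}\, 2^{-(n-s-k)/2}$, where $k$ is the length of the final key; the function names and the example parameters are hypothetical.

```python
import math


def quantum_pa_bound(n, s, k):
    """Assumed bound (3/4) * 2^(-(n - s - k) / 2) on the distance from uniform."""
    return 0.75 * 2 ** (-(n - s - k) / 2)


def max_key_length(n, s, eps):
    """Largest k with quantum_pa_bound(n, s, k) <= eps (0 if none exists).

    Solving (3/4) 2^(-(n-s-k)/2) <= eps gives k <= n - s - 2 log2(3 / (4 eps)).
    """
    k = math.floor(n - s - 2 * math.log2(0.75 / eps))
    return max(k, 0)


if __name__ == "__main__":
    n, s, eps = 1000, 200, 2.0 ** -40
    k = max_key_length(n, s, eps)
    print(k, quantum_pa_bound(n, s, k))
```

In this example roughly $2\log_2(1/\varepsilon)$ bits are sacrificed on top of the $s$ qubits held by the adversary.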

This generalization of the security proof of privacy amplification immediately extends a result by Csiszár and Körner [21] (see also [22]) to the quantum case. Consider a situation where Alice and Bob share information described by N independent realizations of random variables X and Y, respectively, and where Eve has information described by realizations of a classical random variable Z. The result of [21] says that the number of secret key bits that can be generated by one-way communication (from Alice to Bob) over a public channel is at least (roughly) $N(I(X;Y) - I(X;Z))$, for large N. The protocol that Alice and Bob have to apply consists of an error correction step followed by a privacy amplification step using a two-universal random function. If we now consider a situation where Eve holds s qubits of quantum information about X, it follows immediately from Corollary 19 that the same protocol can be used to generate a secret key of length roughly $N(I(X;Y) - s)$.

In most QKD protocols, Alice encodes some classical information X into the state of a quantum system and sends it to Bob. Upon receiving this state, Bob applies a measurement, resulting in classical information Y. After this step, the adversary might hold some quantum information about X and Y. The situation is thus characterized by classical random variables X and Y together with the quantum system of Eve, where the size of her system depends on the error rate tolerated by the protocol (see [10]). Hence, the generalization of the Csiszár-Körner bound described above directly gives an expression for the amount of key that can be generated by the protocol. In particular, it proves that the security holds against any type of attack (including coherent measurements on Eve's whole quantum system).

13 Note that this is an example illustrating the fact that a bound on the expected distance of a single bit H(X) from uniform, $d(H(X)|W_S; H)$, suffices to derive bounds on the expected distance from uniform $d(G(X)|W_S; G)$ of a long key G(X) obtained by privacy amplification. In the case of quantum knowledge, however, it is possible to prove even stronger statements for the single-bit case, resulting in a strengthened version of Corollary 13 which gives a bound on a quantity similar to $d(H(X)|W_S; H)$. Using this and Footnote 12, the constant $\frac{3}{4}$ in Corollary 19 can be replaced by $\frac{1}{2}$.

VI. Conclusions and Open Problems 

It is a fundamental question whether s quantum bits are 
more powerful than s classical bits in order to store infor- 
mation about an n-bit value X (for n > s). We considered 
the problem of answering a randomly chosen question F about 
X, given only the stored information about X. The uncertainty 
about the answer F(X) is then a measure for the usefulness 
of the stored information. It can be quantified in terms of 
the distance of F(X) from uniform conditioned on the stored 
information, which, for binary questions F, corresponds to the 
advantage over 1/2 of the success probability when guessing 
F(X). It turns out that when storing a bitstring X of length 
n = 2 bits, one quantum bit can indeed be more useful than 
one classical bit (cf. Lemma 14). However, for larger values of
n − s, the difference between classical and quantum memory
becomes inessential. 14

We have shown that this has interesting implications for 
cryptography. In particular, privacy amplification by two- 
universal hashing remains secure even against adversaries 
holding quantum information (cf. Corollary 19). This also
leads to conceptually simpler and more general security proofs 
for quantum key distribution, where privacy amplification is 
used for the classical post-processing of the raw key (cf. [10], 
[13]). 

It is well-known that so-called strong extractors [9] can be 
used to do privacy amplification in the classical case. While 
two-universal hashing can be seen as a special case of this,
the converse generally does not hold. It is an open problem 
whether strong extractors are sufficient to generate a key which 
is secure against a quantum adversary in general. 
Appendix

Lemma 20 (Schur's inequality): Let $A$ be a linear operator on a $d$-dimensional Hilbert space $\mathcal{H}_d$ and let $\{\mu_i\}_{i=1}^d$ be its eigenvalues. Then

$$\sum_{i=1}^{d} |\mu_i|^2 \leq \operatorname{tr}(A A^\dagger) ,$$

with equality if and only if $A$ is normal (i.e., $A A^\dagger = A^\dagger A$).

Proof: See, e.g., [23]. ■
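Lemma 21: Let $A$ be a normal operator on a $d$-dimensional Hilbert space $\mathcal{H}_d$. Then

$$|\operatorname{tr}(A)|^2 \leq d \cdot \operatorname{tr}(A A^\dagger) .$$

Proof: Since $A$ is normal, we have

$$\operatorname{tr}(A) = \sum_{i=1}^{d} \mu_i \quad \text{and} \quad \operatorname{tr}(A A^\dagger) = \sum_{i=1}^{d} |\mu_i|^2 ,$$

where $\{\mu_i\}_{i=1}^d$ are the eigenvalues of $A$. The assertion then follows from Jensen's inequality stating that

$$\Bigl| \frac{1}{d} \sum_{i=1}^{d} \mu_i \Bigr|^2 \leq \frac{1}{d} \sum_{i=1}^{d} |\mu_i|^2 .$$

■
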
Lemma 22: Let $a, b \in \mathbb{N}$. Then the following equalities hold:

$$\sum_{z=0}^{2a} \binom{2a}{z}\, |z - a| = a \binom{2a}{a} \tag{14}$$

$$\sum_{z=0}^{\min(a,b)} \frac{z}{(a+z)!\,(a-z)!\,(b+z)!\,(b-z)!} = \frac{ab}{2(a+b)\,(a!)^2\,(b!)^2} \tag{15}$$

$$\sum_{z=0}^{\min(a,b)} \frac{z + \frac{1}{2}}{(a+z+1)!\,(a-z)!\,(b+z+1)!\,(b-z)!} = \frac{1}{2(a+b+1)\,(a!)^2\,(b!)^2} \tag{16}$$

Proof: The first equality follows from a straightforward calculation, using the identity $\binom{a}{z} \cdot \frac{z}{a} = \binom{a-1}{z-1}$. The second and the third equality can be obtained with Zeilberger's algorithm [24] which is implemented in many standard computer algebra systems (e.g., Mathematica or Maple). ■
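The identities can also be confirmed for small arguments by direct evaluation, which is a useful cross-check on their exact form. The Python sketch below assumes the following readings: (14) as $\sum_{z=0}^{2a} \binom{2a}{z} |z-a| = a\binom{2a}{a}$, (15) with numerator $z$ and right hand side $\frac{ab}{2(a+b)(a!)^2(b!)^2}$, and (16) with numerator $z + \frac{1}{2}$ and right hand side $\frac{1}{2(a+b+1)(a!)^2(b!)^2}$. Function names are chosen for this example.

```python
from fractions import Fraction
from math import comb, factorial


def check_14(a):
    """Assumed form of (14): sum_z C(2a, z) |z - a| = a C(2a, a)."""
    lhs = sum(comb(2 * a, z) * abs(z - a) for z in range(2 * a + 1))
    return lhs == a * comb(2 * a, a)


def check_15(a, b):
    """Assumed form of (15), evaluated exactly (requires a + b > 0)."""
    lhs = sum(
        Fraction(z, factorial(a + z) * factorial(a - z) * factorial(b + z) * factorial(b - z))
        for z in range(min(a, b) + 1)
    )
    rhs = Fraction(a * b, 2 * (a + b) * factorial(a) ** 2 * factorial(b) ** 2)
    return lhs == rhs


def check_16(a, b):
    """Assumed form of (16), evaluated exactly."""
    lhs = sum(
        Fraction(2 * z + 1, 2)
        / (factorial(a + z + 1) * factorial(a - z) * factorial(b + z + 1) * factorial(b - z))
        for z in range(min(a, b) + 1)
    )
    rhs = Fraction(1, 2 * (a + b + 1) * factorial(a) ** 2 * factorial(b) ** 2)
    return lhs == rhs


if __name__ == "__main__":
    print(all(check_14(a) for a in range(1, 10)))
    print(all(check_15(a, b) for a in range(1, 8) for b in range(1, 8)))
    print(all(check_16(a, b) for a in range(0, 8) for b in range(0, 8)))
```

Such a finite check does not replace the proof, but it catches sign or normalisation errors in the stated closed forms.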
Lemma 23 (Stirling's approximation): For $n \in \mathbb{N}$,

$$\sqrt{2\pi}\; n^{n+\frac{1}{2}}\, e^{-n + \frac{1}{12n+1}} < n! < \sqrt{2\pi}\; n^{n+\frac{1}{2}}\, e^{-n + \frac{1}{12n}} .$$

Proof: A proof of this extension of Stirling's approximation can be found in [25]. ■
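The two-sided bound is easy to test numerically in logarithmic form. The following Python sketch assumes the exponents $-n + \frac{1}{12n+1}$ and $-n + \frac{1}{12n}$ for the lower and upper bound, respectively, and compares them with $\ln n!$ obtained from the standard library function `math.lgamma`; the helper name is hypothetical.

```python
import math


def log_stirling_bounds(n):
    """Logarithms of the assumed lower and upper bounds on n! (n >= 1)."""
    base = 0.5 * math.log(2 * math.pi) + (n + 0.5) * math.log(n) - n
    return base + 1 / (12 * n + 1), base + 1 / (12 * n)


if __name__ == "__main__":
    for n in (1, 2, 5, 10, 50, 100):
        lower, upper = log_stirling_bounds(n)
        log_factorial = math.lgamma(n + 1)  # ln(n!)
        print(n, lower < log_factorial < upper, upper - lower)
```

The gap between the two bounds shrinks like $1/(144 n^2)$, which is why they are sharp enough for the constant-factor estimates used in the proof of the hashing lemma.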



